Skip to content

Avoid parsing skipped range requests in ChunkedStreamManager (PR 10694 follow-up)#20652

Draft
Snuffleupagus wants to merge 1 commit intomozilla:masterfrom
Snuffleupagus:ChunkedStream-sendRequest-skip-empty
Draft

Avoid parsing skipped range requests in ChunkedStreamManager (PR 10694 follow-up)#20652
Snuffleupagus wants to merge 1 commit intomozilla:masterfrom
Snuffleupagus:ChunkedStream-sendRequest-skip-empty

Conversation

@Snuffleupagus
Copy link
Collaborator

While we don't dispatch the actual range request after PR #10694 we still parse the returned data, which ends up being an empty ArrayBuffer and thus cannot affect the ChunkedStream.prototype._loadedChunks property.
Given that no actual data arrived, it's thus pointless[1] to invoke the ChunkedStreamManager.prototype.onReceiveData method in this case (and it also avoids sending effectively duplicate "DocProgress" messages).


[1] With the possible exception of disableAutoFetch === false being set, see

// If there are no pending requests, automatically fetch the next
// unfetched chunk of the PDF file.
if (!this.disableAutoFetch && this._requestsByChunk.size === 0) {
let nextEmptyChunk;
if (stream.numChunksLoaded === 1) {
// This is a special optimization so that after fetching the first
// chunk, rather than fetching the second chunk, we fetch the last
// chunk.
const lastChunk = stream.numChunks - 1;
if (!stream.hasChunk(lastChunk)) {
nextEmptyChunk = lastChunk;
}
} else {
nextEmptyChunk = stream.nextEmptyChunk(endChunk);
}
if (Number.isInteger(nextEmptyChunk)) {
this._requestChunks([nextEmptyChunk]);
}
}
however that never happens when streaming is being used; note

pdf.js/src/core/worker.js

Lines 237 to 238 in f24768d

// We don't need auto-fetch when streaming is enabled.
pdfManagerArgs.disableAutoFetch ||= fullReader.isStreamingSupported;

…694 follow-up)

While we don't dispatch the actual range request after PR 10694 we still parse the returned data, which ends up being an *empty* `ArrayBuffer` and thus cannot affect the `ChunkedStream.prototype._loadedChunks` property.
Given that no actual data arrived, it's thus pointless[1] to invoke the `ChunkedStreamManager.prototype.onReceiveData` method in this case (and it also avoids sending effectively duplicate "DocProgress" messages).

---
[1] With the *possible* exception of `disableAutoFetch === false` being set, see https://github.com/mozilla/pdf.js/blob/f24768d7b4ef0f64976a7f9e80e15ef50c759356/src/core/chunked_stream.js#L499-L517 however that never happens when streaming is being used; note https://github.com/mozilla/pdf.js/blob/f24768d7b4ef0f64976a7f9e80e15ef50c759356/src/core/worker.js#L237-L238
@Snuffleupagus
Copy link
Collaborator Author

/botio test

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/da63b1de95195a6/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/71aa23b7d82a447/output.txt

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/71aa23b7d82a447/output.txt

Total script time: 60.00 mins

@moz-tools-bot
Copy link
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/da63b1de95195a6/output.txt

Total script time: 81.62 mins

  • Unit tests: FAILED
  • Integration Tests: FAILED
  • Regression tests: FAILED
  different ref/snapshot: 1

Image differences available at: http://54.193.163.58:8877/da63b1de95195a6/reftest-analyzer.html#web=eq.log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants